* 1. Describe the dataset to understand the variables
* The first step in any data analysis is to know your data. 
* The 'describe' command gives you an overview of all variables in your dataset: their names, types, and labels.
describe

* 2. Summarize the dataset to get a sense of the data distribution
* Now that we know what variables we have, let's see how they are distributed.
* The 'summarize' command reports the number of observations, mean, standard deviation, min, and max for each numeric variable.
summarize

* 3. Generate descriptive statistics for demographic variables
* Understanding the demographics of our study participants is key.
* We'll summarize age, height, weight, and BMI to get a picture of the physical characteristics of the climbers.
summarize age_years height weight bmi
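* Optional sketch: means and SDs can mislead when a variable is skewed, so it may help to
* view medians and quartiles alongside them. The choice of statistics here is an assumption,
* not part of the original analysis plan.
tabstat age_years height weight bmi, statistics(n mean sd p50 p25 p75 min max) columns(statistics)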

* 4. Calculate the prevalence of smoking, marijuana use, and alcohol consumption
* Lifestyle factors like smoking, marijuana use, and alcohol consumption can affect injury risk.
* We'll use 'tabulate' to calculate the prevalence of these behaviors among our climbers.
tabulate smoking
tabulate marijuana
tabulate alcohol
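* Optional sketch: 'tab1' produces the same one-way tables in a single command, and the
* 'missing' option shows how many climbers did not answer each question. Whether missing
* answers should count in the denominator is an assumption to settle before reporting prevalence.
tab1 smoking marijuana alcohol, missing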

* 5. Analyze climbing characteristics: Climbing level, hours per week, years of climbing
* Let's now focus on the climbing-specific characteristics.
* How experienced are these climbers? How much do they climb each week? What is their climbing level?
* These questions are answered by summarizing the relevant variables.
summarize years_climbing weekly_climbing_hours
tabulate climbing_level
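* Optional sketch: to see whether experience and training volume rise with climbing level,
* the two continuous variables can be summarized within each level. This assumes
* climbing_level is a categorical variable with a manageable number of levels.
tabstat years_climbing weekly_climbing_hours, by(climbing_level) statistics(n p50 p25 p75) columns(statistics)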

* 6. Analyze the Range of Motion (ROM) for Dominant and Non-Dominant limbs
* Range of motion (ROM) is crucial for understanding the physical readiness of climbers.
* We'll summarize ROM for both dominant and non-dominant limbs, focusing on shoulder, elbow, and wrist movements.
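* Shoulder (ds_ = dominant, nds_ = non-dominant): flexion, extension, internal and external rotation
summarize ds_flex ds_ext ds_ri ds_re nds_flex nds_ext nds_ri nds_re
* Elbow (de_ / nde_): flexion, extension, pronation, supination
summarize de_flex de_ext de_pro de_supi nde_flex nde_ext nde_pro nde_supi
* Wrist (dw_ / ndw_): flexion, extension, radial and ulnar deviation
summarize dw_flex dw_ext dw_rad dw_uln ndw_flex ndw_ext ndw_rad ndw_uln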

* 7. Compare ROM between Dominant and Non-Dominant limbs
* Each climber contributes a dominant and a non-dominant measurement, so the two values are paired.
* To test whether ROM differs systematically between the two sides, we'll use the Wilcoxon signed-rank test,
* a nonparametric paired test that does not assume the differences are normally distributed.
signrank ds_flex = nds_flex
signrank ds_ext = nds_ext
signrank ds_ri = nds_ri
signrank ds_re = nds_re
signrank de_flex = nde_flex
signrank de_ext = nde_ext
signrank de_pro = nde_pro
signrank de_supi = nde_supi
signrank dw_flex = ndw_flex
signrank dw_ext = ndw_ext
signrank dw_rad = ndw_rad
signrank dw_uln = ndw_uln
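* Optional sketch: running 12 tests raises the chance of a false positive, so the results can
* be collected in one loop with a Bonferroni-adjusted p-value next to each. The Bonferroni
* correction is an assumption of this sketch, not part of the original plan. The p-value is
* the normal approximation derived from r(z), which 'signrank' stores after each run.
local rom ds_flex ds_ext ds_ri ds_re de_flex de_ext de_pro de_supi dw_flex dw_ext dw_rad dw_uln
local ntests : word count `rom'
foreach v of local rom {
    * The non-dominant variable has the same name with an "n" prefix (ds_flex -> nds_flex)
    quietly signrank `v' = n`v'
    local z = r(z)
    local p = 2*normal(-abs(`z'))
    local p_bonf = min(1, `p'*`ntests')
    display as text "`v' vs n`v':  z = " as result %6.3f `z' ///
        as text "  p = " as result %6.4f `p' ///
        as text "  Bonferroni p = " as result %6.4f `p_bonf'
}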

* 8. Calculate prevalence of pain locations
* Injuries often manifest as pain in specific locations.
* We'll use 'tabulate' to understand where climbers most frequently report pain.
tabulate pain_current_location_shoulder
tabulate pain_current_location_spine
tabulate pain_current_location_hand
tabulate pain_current_location_elbow
tabulate pain_current_location_ankle_feet
tabulate pain_current_location_no_pain
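* Optional sketch: if each pain variable is coded 0 = no, 1 = yes (an assumption; check with
* 'codebook' first), the mean of the variable is the prevalence, so all locations can be
* listed compactly as percentages.
foreach v of varlist pain_current_location_shoulder pain_current_location_spine ///
    pain_current_location_hand pain_current_location_elbow ///
    pain_current_location_ankle_feet pain_current_location_no_pain {
    quietly summarize `v'
    display as text "`v': " as result %5.1f 100*r(mean) as text "% of " as result r(N) as text " climbers"
}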

* 9. Calculate prevalence of specific injuries
* Just as with pain, it's important to know the prevalence of specific injuries.
* 'tabulate' will give us the frequency of each injury reported by the climbers.
tabulate conf__sprained_ankle
tabulate conf__tennis_elbow
tabulate conf__pulley_lesion
tabulate conf__shoulder_injury
tabulate conf__fracture
tabulate conf__joint_laxity
tabulate conf__knee_injury
tabulate conf__wrist_injury
tabulate conf__lombar_pain
tabulate conf__minor_muscle_sprain
tabulate conf__no_diagnosis

* 10. Analyze Shoulder Subluxation and its association with dynamic climbing style
* Is there a relationship between shoulder subluxation and a dynamic climbing style?
* We can explore this using Fisher's exact test, which remains valid when the sample is small or some expected cell counts are low.
tabulate confirmed_shouldersubluxation movement_preference, exact
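* Optional sketch: adding 'expected' shows the expected cell counts (values below about 5 are
* the usual reason to prefer Fisher's exact test over chi-square), and 'column' shows the
* percentage with subluxation within each movement preference.
tabulate confirmed_shouldersubluxation movement_preference, column expected exact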

* 11. Perform t-tests and chi-square tests for group comparisons
* We'll use a two-sample t-test to compare mean years of climbing experience between climbers with and without shoulder subluxation.
* A chi-square test then checks whether shoulder subluxation is associated with gender.
ttest years_climbing, by(confirmed_shouldersubluxation)
tabulate gender confirmed_shouldersubluxation, chi2
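* Optional sketch: the default t-test assumes equal variances and roughly normal data in both
* groups. If the subluxation group is small or years_climbing is skewed, these two checks may
* be worth comparing against it. They are suggested here as sensitivity checks only.
ttest years_climbing, by(confirmed_shouldersubluxation) unequal
ranksum years_climbing, by(confirmed_shouldersubluxation)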

* 12. Analyze the association between movement preference and shoulder subluxation using Fisher's Exact Test
* This is the same test as in step 10 with rows and columns swapped, so the p-value is identical; only the table layout changes.
tabulate movement_preference confirmed_shouldersubluxation, exact

* 13. Calculate the prevalence difference and prevalence ratio using the 'cs' command
* 'cs' takes the outcome variable first and the exposure variable second; both must be coded 0/1.
* The prevalence ratio and difference give us insight into the strength of the association.
cs confirmed_shouldersubluxation movement_preference

* Stata labels the output as risk ratio and risk difference; with cross-sectional data, read them as the prevalence ratio and prevalence difference.

* 14. E-value calculation (Note: Requires external calculation)
* To assess the robustness of our findings, we calculate an E-value.
* This quantifies how strong an unmeasured confounder would have to be to explain away the observed association.
* The E-value is calculated using the methodology proposed by VanderWeele and Ding.
* Please use the following citations when referring to this analysis:
* Mathur MB, Ding P, Riddell CA, VanderWeele TJ (2018). Website and R package for computing E-values. Epidemiology, 29(5), e45-e47.
* VanderWeele TJ, Ding P (2017). Sensitivity analysis in observational research: introducing the E-value. Annals of Internal Medicine, 167(4), 268-274.
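* Optional sketch: as a cross-check on the external calculator, the closed-form E-value for a
* ratio RR > 1 is RR + sqrt(RR*(RR - 1)). The code below applies it to the ratio and confidence
* limits that 'cs' stores in r(rr), r(lb_rr), and r(ub_rr). Treating the prevalence ratio as the
* risk ratio in this formula is an assumption of this sketch.
quietly cs confirmed_shouldersubluxation movement_preference
local rr = r(rr)
local lb = r(lb_rr)
local ub = r(ub_rr)
* The formula is defined for ratios above 1, so a protective ratio is inverted first
if `rr' < 1 {
    local rr = 1/`rr'
    local tmp = `lb'
    local lb = 1/`ub'
    local ub = 1/`tmp'
}
local e_point = `rr' + sqrt(`rr'*(`rr' - 1))
* For the confidence limit closest to the null; the E-value is 1 if the interval includes 1
local e_ci = cond(`lb' <= 1, 1, `lb' + sqrt(`lb'*(`lb' - 1)))
display as text "E-value (point estimate):   " as result %6.2f `e_point'
display as text "E-value (confidence limit): " as result %6.2f `e_ci'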

